Shareware Overload Trio 2

Text File | 1992-12-01 | 42KB | 992 lines

PC-SIZE: Consultant
A Program for Sample Size Determinations
Version 1.01
(c) 1990

"One of many STATOOLS(tm)..."

by
Gerard E. Dallal
54 High Plain Road
Andover, MA 01810

PC-SIZE: Consultant is a prompt driven program that calculates the sample size requirements for confidence intervals and tests of significance. When constructing a confidence interval for a single population mean or for the difference between two population means based on independent samples, PC-SIZE calculates the sample size necessary to ensure, with a prespecified probability, that the interval does not exceed a specified length. For tests of significance, PC-SIZE calculates the sample size needed to achieve a specified amount of power and calculates the power of specific sample sizes for three experimental situations: comparing the means of two independent samples, comparing the means of paired data, and comparing proportions in two independent samples.

Why obtain sample size estimates? Observations cost time, money, and manpower. It is wasteful to collect more data than necessary. It is even more wasteful to undertake a study where sample size calculations would have revealed the impossibility of collecting enough data to answer the research question.

NOTICE

Documentation and original code copyright 1989 by Gerard E. Dallal. Please acknowledge PC-SIZE: Consultant in any manuscript that uses its calculations.

DISCLAIMER

STATOOLS are provided "as is" without warranty of any kind. The entire risk as to the quality, performance, and fitness for intended purpose is with you. You assume responsibility for the selection of the program and for the use of results obtained from that program.

TABLE OF CONTENTS

Background
Introduction
Operation
    Confidence Intervals
    Tests of Significance
        Independent samples
            Unequal sample sizes
        Paired data
        Proportions
Power of specific sample sizes
Reports
Technical details
    Initial approximation--Confidence intervals
    Initial approximation--Tests of significance
    Tests of relative differences in population means
    Validation
Other applications: Tests of significance
    Two period cross-over design
    Comparing a single sample to a known standard
    Pre- to Post-treatment changes in two independent samples
Other issues
    Simplifying the experiment
    Repeated measurements over time
    Two-tailed tests
Algorithms
References
Duplication and shareware notices
Registration form

BACKGROUND

PC-SIZE, version 1.0, was written in 1985 to help with my work as a statistical consultant. It saved me from having to perform sample size calculations by hand on the spot (with potentially disastrous consequences from a slight slip) and enabled me to avoid breaks in continuity from ending consulting sessions prematurely so that I could obtain sample size estimates in unpressured surroundings.
PC-SIZE, then, was written for someone familiar with the theory of sample size estimation who needed a way to carry out the calculations quickly and reliably. In 1986, PC-SIZE was made available to others who wanted this capability at their fingertips (Dallal, 1986).

The original PC-SIZE proved to be an effective teaching tool. Students were freed from the burden of grappling with formulas and carrying out calculations by hand, but almost immediately they began to ask for a program that explained itself in more detail. That program is PC-SIZE: Consultant.

The estimates from PC-SIZE: Consultant will be the same as those obtained from the original PC-SIZE, but the operation of PC-SIZE: Consultant should prove more straightforward to the casual user. Sample size calculations can now be performed not only by professional statisticians but also by investigators who initiate the research.

INTRODUCTION

Welcome to PC-SIZE: Consultant. With this program, you will be able to obtain sample size estimates for a wide variety of experimental situations. No computer program can turn you into a trained statistician, but if a task is sufficiently narrow and well-defined (such as calculating a sample size estimate) it is possible to duplicate what goes on in a consulting session with a professional statistician.

The most important rule for achieving a successful result is, "Be honest." If your experiment is one of those that PC-SIZE can handle, the program will serve you well. If you try too hard to shoehorn your study into this program, the resulting estimates may have no relevance to your situation. PC-SIZE is no different from any other self-help tool. It gives you the opportunity to act without having an expert on hand, but it places on you the responsibility for knowing when you are overextending yourself.
There are many fine books on medicine, home repairs, and auto mechanics, but the reader has to know when it is time to put the book aside and call in the doctor, carpenter, electrician . . . or statistician.

The second most important rule for achieving a successful result is, "Be honest." You will be asked for your best guess about many aspects of the experiment you are considering. Sample size estimates can change dramatically with the values you specify in response to the prompts. Decrease the estimate of the measurement error by 30%, for example, and the sample size estimate is cut in half. It might seem tempting to give optimistic estimates so that the sample size requirement will be reduced to manageable numbers, but the only thing such estimates accomplish is to divert valuable resources to projects that should never have been undertaken.

Your feedback is appreciated. Please let me know if the documentation or program prompts contain any ambiguities or if there is anything that could have been explained more fully. I do not expect to broaden the scope of the program, but I want PC-SIZE to excel at what it does.

OPERATION

PC-SIZE: Consultant is a prompt driven program. Any quantity that appears in square brackets is a default value that can be obtained by pressing the Enter key.

PC-SIZE gives sample size estimates for confidence intervals to be shorter than a given length with a specified amount of probability. The confidence intervals can be for a single population mean or for the difference between two population means based on independent samples. Confidence intervals for the difference between two population means based on paired samples can be obtained by using the single mean option and answering the prompts in terms of the paired differences.

PC-SIZE gives sample size estimates and performs power calculations for three hypothesis testing situations:

1. comparing means from two independent samples. Samples are independent when there is no special relationship between the experimental units in the two samples.

2. comparing means from paired samples. In paired samples there is a relationship between the experimental units. Two measurements may be made on the same individual, on twins, on siblings, or on spouses.

3. comparing two independent proportions. This is similar to situation 1, except that the proportion of units possessing a particular characteristic--for example, the proportion of individuals with heart disease--is the measurement of interest.

These few situations describe the vast majority of scientific experiments. Theorists have been able to derive formulas for almost any special situation but, in practice, the less complicated formulas often serve us better.

Consider, for example, a comparison involving three treatments. The hypothesis of no treatment differences is usually tested by using analysis of variance techniques. There is a formula that specifies the sample size needed to test this hypothesis, but the numbers may not be adequate for a complete analysis of the data. The sample size estimate says how many subjects are needed to achieve a significant result when testing whether the three samples come from populations with equal means, but the sample size may be too small for determining the specific differences among the three groups.

Suppose 3 treatment groups are expected to have mean responses of A=1.0, B=2.0, and C=2.5, with a within group standard deviation of 1.0. The sample size formula for analysis of variance says that the experiment must include 10 individuals per group to have an 80% chance of establishing that the means of the three groups are not all equal. A sample size of 10 per group gives an 88% chance of establishing that A and C have different means, but it gives only a 56% chance of establishing that A and B are different and a 19% chance of establishing that B and C are different.
If the difference between B and C is the main reason for the study, a sample size of 10 per group is much too small. A sample size of 64 per group would be more appropriate.

CONFIDENCE INTERVALS

The sample size needed to ensure that a confidence interval will be no longer than a prespecified value depends on 4 quantities:

1. The amount of confidence in the interval. All other things being equal, more observations are needed to produce an interval of greater confidence.

2. The length of the interval. All other things being equal, more observations are needed to produce a shorter interval.

3. The probability that the interval does not exceed the specified length. More observations are needed to increase the probability that the interval does not exceed the specified bound.

4. The variability in the individual observations. All other things being equal, the greater the variability in the data, the more observations needed.

Variability may be specified as either a standard deviation or as an interval that contains a specified percentage of the data. Most sample size formulas are written in terms of the standard deviation, but researchers are often uncomfortable estimating standard deviations. Investigators usually find it easier to specify the width of an interval that is likely to contain most of the data. PC-SIZE asks for the percentage of the data in the interval and then for the width of the interval. PC-SIZE then computes the normal percentile z that contains the specified probability between -z and +z. The standard deviation is estimated as the width divided by 2*z since, for normally distributed data, the specified percentage of observations will be within z standard deviations of the population mean.

TESTS OF SIGNIFICANCE

Independent Samples

The sample size estimate depends on 4 quantities:

1. the level at which the null hypothesis of equal population means will be tested. This is typically 0.05 or 0.01. The level of the test is the probability of reporting a difference if there is really none. All other things being equal, more observations are needed to reduce the probability of making such an error.

2. the likely difference between the means of the two populations from which the samples are drawn. The smaller the difference, the larger the sample size needed to establish the difference.

   The likely difference between the means of the two populations may be specified in absolute units (e.g., 10 mg/dl, 43 days) or as a percent change. When the difference is in absolute units, PC-SIZE: Consultant gives the option of specifying the individual means or their difference.

   Note: The sample size estimate depends on the sign of the percent difference. A difference of -6%, for example, gives slightly different results from a difference of +6%. The reason is that if the larger mean is 6% greater than the smaller, then the smaller mean is 5.7% less than the larger, not 6% less.

3. measurement error, the inherent variability in the measurement process. The larger the measurement error, the larger the sample size needed to establish a given difference in means.

   Measurement error may be specified as either a standard deviation or an interval that contains a specified percentage of the data. Most sample size formulas are written in terms of the standard deviation, but researchers are often uncomfortable estimating standard deviations. PC-SIZE asks for the percentage of the data in the interval and then for the width of the interval. PC-SIZE then computes the normal percentile z that contains the specified probability between -z and +z. The standard deviation is estimated as the width divided by 2*z since, for normally distributed data, the specified percentage of observations will be within z standard deviations of the population mean.

4. power, the probability of detecting the specified difference. The larger the sample size, the greater the probability of establishing a given difference. The most commonly used value is 80%. Values less than 80% are usually judged too small to justify the cost of the experiment. Probabilities larger than 80% are nice, of course, but may result in a prohibitively large or costly experiment.

Unequal sample sizes

PC-SIZE allows the user to specify the ratio of the sample sizes, sample 1 to sample 2. Calculations are driven by sample 1. The estimate for the second sample size is obtained by dividing the first sample size by the specified ratio and reporting the smallest integer no less than this value. This procedure can lead to situations where the estimated sample sizes are not precisely in the specified ratio and where inverting the ratio will produce slightly different estimates. For example, with a difference of 3, within group standard deviation of 4.7, level of test 0.05, and power at alternative 0.90:

    Ratio    Sample Sizes    Power
    2.50       91, 37        .90125
    0.40       37, 93        .90307

Paired Data

PC-SIZE asks for the expected difference and an estimate of the variability OF THE DIFFERENCES. Variability may be specified as a standard deviation (the standard deviation of the differences is also known as the standard error of the difference) or as an interval containing most of the differences.

Often, a researcher will have some idea of the variability of the individual responses but not of their difference. If the correlation between the two responses can be estimated, the variance (the square of the standard deviation) of the differences can be obtained by using the formula

    var(X-Y) = var(X) + var(Y) - 2 * corr(X,Y) * sd(X) * sd(Y) ,

where var(X) is the variance of X and sd(X) is the standard deviation. If the variances of the two responses are equal, the relation reduces to

    sd(X-Y) = sd(X) * SQRT{2 * (1 - corr(X,Y))} .
Thus, if the correlation between the two measurements is 0.5, the standard deviation of the differences will be equal to the standard deviation of the individual measurements.

Proportions

PC-SIZE uses formulas 3.18 and 3.19 of Fleiss (1981) to determine the sample size for a test of the equality of two proportions. (Note: The formulas and table in Fleiss (1981) differ substantially from those in Fleiss (1973).) This estimate is a large sample approximation based on standard normal theory. The user is prompted for the values of the proportions under the alternative to equality.

Equal sample sizes: In some instances the values produced by PC-SIZE will be 1 greater than those in Fleiss's Table A.3. Fleiss has apparently taken the values produced by the formulas and rounded to the nearest integer. PC-SIZE reports the smallest integer not less than the results of the formulas.

Unequal sample sizes: The user specifies the ratio of the sample sizes, sample 1 to sample 2. Calculations are driven by sample 1. The estimate for sample size 2 is obtained by dividing sample 1's size by the specified ratio and reporting the smallest integer no less than this value. This procedure can lead to situations where the estimated sample sizes are not precisely in the specified ratio and where switching the samples' labels and inverting the ratio will produce slightly different estimates. For example (cf. Fleiss, 1981, p. 45), with size of test 0.01 and power at alternative 0.95:

     P1      P2     Ratio(1:2)    Group 1    Group 2
    0.25    0.40       2.00         531        266
    0.40    0.25       0.50         266        532

Use sample sizes consistent with the specified ratio that are no less than the estimates produced by PC-SIZE.

POWER OF SPECIFIC SAMPLE SIZES

PC-SIZE can calculate the power of specific sample sizes. The alternative to equality of means or proportions is specified in the same way as when estimating a sample size. A starting value, final value, and an increment of the group 1 sample size must be specified, as well.
For example, power can be calculated for group 1 sample sizes of 40 to 60 in increments of 5. If the starting and final values differ by no more than 1, no increment will be requested since it must be 1. If the starting and final values of the group 1 sample size are equal, the next prompt asks for the group 2 sample size itself rather than the ratio of the sample sizes, to make it easier to perform power calculations for published data.

REPORTS

PC-SIZE: Consultant generates reports of its calculations in a form suitable for inclusion in manuscripts and proposals. The report can be printed on screen or in an ASCII text file. Whenever reports are placed in a file, they are also printed on screen. The name of the output file cannot be changed during a PC-SIZE session. Reports of power calculations contain only the power for the last sample size requested.

TECHNICAL DETAILS

PC-SIZE: Consultant is written in Microsoft FORTRAN version 5.0. The program was compiled with all optimization turned off. Double precision arithmetic is used throughout.

Initial Approximation--Confidence Intervals

PC-SIZE uses the usual large sample approximation for the sample size for intervals whose expected length is close to the specified upper bound, namely

    (s*z/w)**2

for single population means and

    2*(s*z/w)**2

for the difference between two population means based on independent samples, where s is the (common) population standard deviation, w is the half-width (length/2) of the interval, and z is the appropriate percentile of the standard normal distribution.

The probability that the length of the interval is less than the specified upper bound is calculated at this initial sample size, truncated to the nearest multiple of 10 if the initial estimate exceeds 1000, and for successive increments until the required probability is achieved. The increments are 1 below sample sizes of 1,000, 2 below 5,000, 5 below 10,000, and 20, otherwise.
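
The starting value and the search increments described above can be sketched in a few lines of Python. This is only an illustration, not the program's FORTRAN source: the function names are hypothetical, z is assumed to be the two-sided normal percentile for the stated confidence, and the handling of the fractional part and of the "multiple of 10" rule is one plausible reading of the text.

```python
import math
from statistics import NormalDist


def initial_sample_size(sd, half_width, confidence, two_samples=False):
    """Large sample starting value (s*z/w)**2, doubled for two samples.

    Assumption: z is the two-sided percentile, e.g. about 1.96 for 95%.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (sd * z / half_width) ** 2
    return 2 * n if two_samples else n


def starting_point(n0):
    """Assumed reading: drop the fraction, and drop to a multiple of 10
    when the initial estimate exceeds 1000."""
    n = math.floor(n0)
    return n - n % 10 if n0 > 1000 else n


def search_increment(n):
    """Step sizes quoted in the text: 1, 2, 5, then 20."""
    if n < 1000:
        return 1
    if n < 5000:
        return 2
    if n < 10000:
        return 5
    return 20


# Standard deviation 10, interval half-width 2, 95% confidence.
n0 = initial_sample_size(sd=10, half_width=2, confidence=0.95)
print(round(n0, 2), starting_point(n0), search_increment(n0))
```

The sketch covers only where the search begins and how it steps. The probability evaluated at each candidate sample size is not reproduced here.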
The probabilities are obtained by using the inequalities preceding expressions (7) and (9) in Kupper and Hafner (1989) for sample sizes less than 500. For sample sizes of 500 or more, the F(1,n-1) and F(1,2n-2) distributions are replaced by the chi-square distribution with 1 degree of freedom.

Initial Approximation--Tests of Significance

PC-SIZE invokes a "large sample approximation" (using a non-central chi-square distribution in place of the non-central F) to get a rough estimate of the necessary sample size. If the large sample estimate is 500 or more, only the smallest integer no less than this estimate is reported. Otherwise, the non-central F distribution is used to obtain the exact sample size estimate. Calculations start at 1 less than the integer part of the large sample approximation and continue at increments of 1 until the required power is achieved. Proportions are handled differently; only the large sample approximation is used, as described above under "Proportions."

Tests of Significance for Relative Differences

PC-SIZE will calculate the sample size for comparing two population means that differ by a relative percent rather than by an absolute amount. The within group variability is specified as a percentage of the population mean. The delta method is applied to the logarithms of the measurements so that the usual calculations for absolute differences can be applied to the logs. If one mean is (100*(1+e))% of the other mean and each population's standard deviation is (100*d)% of its mean, then the expected difference of the natural logs is approximately log(1+e) and the common within group standard deviation of the logs is 'd'.

Validation

PC-SIZE: Consultant has been checked against a selection of entries from tables 2.3.4, 2.3.5, 2.3.6, and 2.4.1 of Cohen (1977) and table A.3 of Fleiss (1981).
No differences have been observed except for those already mentioned for proportions where Fleiss appears to round fractional results while PC-SIZE reports the next largest integer.

OTHER APPLICATIONS: Tests of Significance

Two Period Cross-over Design

The two period cross-over design can be treated as a paired t-test with one fewer error degrees of freedom than for the paired t-test based on the same total number of observations. Proceed as for a paired t-test, obtaining a sample size of 'n'. For each sequence (AB, BA), take (n+1)/2 observations if 'n' is odd, 1+n/2 if 'n' is even.

Comparing a Single Sample to a Known Standard

The mean of a single sample can be compared to a specified constant by using the paired t-test mode. Set the "expected difference" to the expected difference between the unknown population mean and the known standard. Set the "estimate of standard deviation of difference" to the estimated population standard deviation.

Comparing Pre- to Post-treatment Changes in Two Independent Groups

PC-SIZE can be used to compute sample size requirements for pre- and post-treatment comparisons between two regimens, where each subject receives only one regimen. This example is of particular interest because it involves comparing differences between differences, that is, the most important comparison is usually the difference between the pre/post differences for the two regimens.

Suppose one regimen is an active agent (treatment) and the other is a placebo control (control). There are three questions that might be asked:

1. Does the treatment group change over time?

2. Does the control group change over time?

3. Are the changes in the treatment and control groups the same?

The sample sizes needed to answer questions 1 and 2 can be obtained by using PC-SIZE for paired samples. The pairs (or differences) are the pre- and post-treatment measurements (or their difference) made on the same individual.
To obtain a sample size estimate, it is necessary to provide best guesses of the likely pre- to post-treatment changes and of the standard deviation of the changes.

Question 3 asks about the difference between the pre/post changes (or the difference between the differences!). It is a question about independent samples in which the responses are differences between the pre- and post-treatment measurements.

Example: Does vitamin C supplementation affect high density lipoprotein (HDL) levels? Suppose the tests are to be carried out at the 0.05 level of significance, 80% power is required, and the best guesses about likely changes in HDL levels are 2 mg/dl for the controls (due to heightened awareness from participation in the study) and 6 mg/dl for those given vitamin C. Suppose the standard deviation for a single HDL measurement made on a cross-section of individuals is known to be about 8 mg/dl and the correlation between pre- and post-treatment HDL levels is expected to be around 0.70. Then the standard deviation of the differences is expected to be about

    6.2 = [ 8 * SQRT(2 * (1 - 0.70)) ] .

To determine whether the control group changes over time, the paired t test portion of PC-SIZE is used with an estimated effect of 2 and an estimated standard deviation of 6.2 to obtain a sample size estimate of 78 control subjects.

To determine whether the vitamin C group changes over time, the paired t test portion of PC-SIZE is used with an estimated effect of 6 and an estimated standard deviation of 6.2 to obtain a sample size estimate of 11 vitamin C subjects.

To determine whether the vitamin C and control groups differ, the independent samples portion of PC-SIZE is used with estimated means of 2 and 6 (or, equivalently, a difference of 4) and an estimated standard deviation of 6.2 to obtain a sample size estimate of 39 per group.

Unless it is important to establish the change in the control group over time, the experiment would be carried out with 39 subjects per group.
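
The arithmetic of the HDL example can be roughly checked with the short Python sketch below. It is not PC-SIZE's method: the function names are hypothetical, and the sample sizes come from the textbook normal approximation n = ((z_alpha + z_beta) * sd / difference)**2 (doubled for two independent groups) rather than the non-central distributions the program uses. The approximate values therefore fall a little below the 78, 11, and 39 reported above.

```python
import math
from statistics import NormalDist


def sd_of_differences(sd, corr):
    """sd(X-Y) = sd(X) * SQRT(2 * (1 - corr)), assuming equal variances."""
    return sd * math.sqrt(2 * (1 - corr))


def normal_approx_n(diff, sd, alpha=0.05, power=0.80, groups=1):
    """Normal approximation to the per-group sample size for a two-tailed
    test. Use groups=1 for paired data, groups=2 for independent samples.
    This is a rough stand-in, not the exact calculation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(groups * ((z_alpha + z_beta) * sd / diff) ** 2)


sd_diff = round(sd_of_differences(8, 0.70), 1)   # 6.2, as in the text
print(sd_diff)
print(normal_approx_n(2, sd_diff))               # control group over time
print(normal_approx_n(6, sd_diff))               # vitamin C group over time
print(normal_approx_n(4, sd_diff, groups=2))     # between groups, per group
```

With these inputs the approximation gives 76, 9, and 38. The gap is largest in relative terms for the smallest sample, where the normal approximation is weakest.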
OTHER ISSUES

Simplifying the Experiment

A useful approach to sample size estimation is to reduce an experiment to the more important two group comparisons and use the largest sample size required by these critical comparisons as the common sample size for all groups. In this way, there will be a good chance of a successful outcome if your estimates are correct. This method often uncovers ways in which time and resources can be saved--by eliminating sets of treatments and conditions that, upon reflection, are not essential to the research.

Repeated Measurements Over Time

Many experiments involve measuring individuals' responses over time. Sample size estimation can be carried out in terms of the summaries of the responses that will be subjected to analysis. Examples of such summaries are time to peak, number of peaks, average response, and area under the curve.

Two-Tailed Tests

PC-SIZE treats all tests for equality of population means and proportions as two-tailed tests, that is, it assumes that the null hypothesis of equality will be rejected regardless of which sample has the greater mean or proportion. No provision is made for one-tailed tests, where the hypothesis of equality is rejected only if one particular group has the greater mean or proportion.

One-tailed tests are inherently unsound. There are no situations where differences in a particular direction are uninteresting. The usual example given to justify the use of one-tailed tests is that of comparing a new treatment to an established treatment. The test should be one-tailed, the argument goes, because the only way the new treatment will displace the standard treatment is if the new treatment is shown to be better; significant results favoring the standard treatment do not matter. The reasoning is flawed. We certainly want to know if the new treatment performs significantly worse than the standard treatment, if only to ask why the new treatment was proposed in the first place.
To answer this criticism of one-tailed tests, some analysts have proposed the use of unbalanced two-tailed tests, tests for which the rejection of equality of means requires greater differences in one direction than the other. (In the previous example, an 0.05 level test might be constructed from outcomes that have a probability of 0.04 favoring the new treatment and 0.01 favoring the standard treatment.) But because there is no standard method for choosing the probabilities, analysts have stayed with the usual two-tailed test which assigns the same probability to each tail.

ALGORITHMS

PC-SIZE makes use of the following published routines, modified to run in double precision:

Best DJ and Roberts DE (1975), "Algorithm AS 91. The Percentage Points of the Chi-squared Distribution," Applied Statistics, 24, 385-388.

Bhattacharjee GP (1970), "Algorithm AS 32. The Incomplete Gamma Integral," Applied Statistics, 19, 285-287.

Cran GW, Martin KJ and Thomas GE (1977), "Remark AS R19 and Algorithm AS 109. A Remark On Algorithms AS 63: The Incomplete Beta Integral, and AS 64: Inverse of the Incomplete Beta Function Ratio," Applied Statistics, 26, 111-114.

Hill ID (1973), "Algorithm AS 66. The normal integral," Applied Statistics, 22, 424-427.

Majumder KL and Bhattacharjee GP (1973), "Algorithm AS 63. The Incomplete Beta Integral," Applied Statistics, 22, 409-411.

Odeh RE and Evans JO (1974), "Algorithm AS 70. The Percentage Points of the Normal Distribution," Applied Statistics, 23, 96-97.

and a FORTRAN translation of

Pike MC and Hill ID (1966), "Algorithm 291. Logarithm of the Gamma Function," Communications of the Association for Computing Machinery, 9, 684.

REFERENCES

Cohen J (1977), Statistical Power Analysis for the Behavioral Sciences, revised edition. New York: Academic Press.

Dallal GE (1986), "PC-SIZE: A Program for Sample-Size Determinations," The American Statistician, 40, 52.

Fleiss JL (1973), Statistical Methods for Rates and Proportions. New York: John Wiley & Sons, Inc.

Fleiss JL (1981), Statistical Methods for Rates and Proportions, 2nd ed. New York: John Wiley & Sons, Inc.

Kupper LL and Hafner KB (1989), "How Appropriate Are Popular Sample Size Formulas?," The American Statistician, 43, 101-105.

UPDATE HISTORY

Version 1.01

    allows for printing of larger numbers in some report fields.

    for Size/Relative/Interval, the length of the interval is now reported back properly. (Sample size calculations in version 1.00 were correct.)

    some prompts rewritten for clarity.

DUPLICATION AND SHAREWARE NOTICES

You may distribute unmodified copies of PC-SIZE: Consultant and its documentation provided there is no charge beyond a duplication fee not to exceed $5.

PC-SIZE: Consultant is shareware. If you find the program to be useful, a non-exclusive license fee of $15 should be sent to the author. Instructors may duplicate a licensed copy for classroom use for a fee of $5 per student. Students may keep their copies at the end of the course.

REGISTRATION FORM

Licensed from:  Gerard E. Dallal
                54 High Plain Road
                Andover, MA 01810

Date:    /    /

--------------------------------------------------------
                          Qty      Fee        Fee
ITEM                               each     extended

PC-SIZE: Consultant     -----  x   $15  =   --------
  (license fee)
student users           -----  x   $5   =   --------
distribution disk       -----  x   $5   =   --------

                               SUBTOTAL     --------
     5% Sales Tax (MA residents only)       --------
                                  TOTAL     --------

Please make check payable to Gerard E. Dallal

You may keep a copy of this invoice for your tax records.